Abstract. The integration of reasoning and computation services across
system and language boundaries is a challenging problem of computer
science. In this paper, we use integration for the scenario where we have
two systems that we integrate by moving problems and solutions between
them. While this scenario is often approached from an engineering
perspective, we take a foundational view. Based on the generic declarative
language MMT, we develop a theoretical framework for system integration
using theories and partial theory morphisms. Because MMT permits
representations of the meta-logical foundations themselves, this includes
integration across logics. We discuss safe and unsafe integration schemes
and devise a general form of safe integration.



1 Introduction 

The aim of integrating Computer Algebra Systems (CAS) and Deduction Sys- 
tems (DS) is twofold: to bring the efficiency of CAS algorithms to DS (without 
sacrificing correctness) and to bring the correctness assurance of the proof theo- 
retic foundations of DS to CAS computations (without sacrificing efficiency). In 
general, the integration of computation and reasoning systems can be organized 
either by extending the internals of one system by methods (data structure and 
algorithms) from the other, or by passing representations of mathematical ob- 
jects and system state between independent systems, thus delegating parts of the 
computation to more efficient or secure platforms. We will deal with the latter 
approach here, which again has two distinct sets of problems. The first addresses 
engineering problems and revolves around communication protocol questions like
shared state, distributed garbage collection, and translating input syntaxes of
the different systems. The syntax questions have been studied extensively in the
last decade and led to universal content markup languages for mathematics
like MathML and OpenMath to organize communication. The second
set of problems comes from the fact that passing mathematical objects between 
systems can only be successful if their meaning is preserved in the communica- 
tion. This meaning is given via logical consequence in the logical system together 
with the axioms and definitions of (or inscribed in) the respective systems.
We will address this in the current paper, starting from the observation
that content level communication between mathematical systems, to be
effective, cannot always respect logical consequence. On the other hand, there is the
problem of trusting the communication itself, that boils down to studying the 
preservation of logical consequence. Surprisingly, this problem has not received 
in the literature the attention it deserves. Moreover, the problem of faithful safe 
communication, which preserves not only the consequence relation but also the 
intuitive meaning of a formal object, is not even always perceived as a structural 
problem of content level languages. 

For example, people with a strong background in first order logic tend to 
assume that faithful and safe communication can always be achieved simply by 
strengthening the specifications; others believe that encoding logical theories is 
already sufficient for safe communication and do not appreciate that the main 
problem is just moved to faithfulness. Several people from the interactive theo- 
rem proving world have raised concerns about trusting CAS and solved the issue 
by re-checking the results or the traces of the computation (here called proof 
sketches). Sometimes this happens under the assumption that the computation 
is already correct and just needs to be re-checked, neglecting the interesting 
case when the proof sketch cannot be refined to a valid proof (or computation) 
without major patching (see [Del99] for a special case).

In this paper, we first give a categorization of integration problems and solu- 
tions. Then we derive an integration framework by adding some key innovations 
to the MMT language, a Module system for Mathematical Theories described in 
[RK11]. MMT can be seen as a generalization of OpenMath and as a formalized
core of OMDoc. Of course, any specific integration task requires a substantial 
amount of work — irrespective of the framework used. But our framework guides 
and structures this effort, and can implement all the generic aspects. In fact, 
current integration tasks typically involve setting up an ad-hoc framework for 
exactly that reason. 

We sketch the MMT framework first in Sect. 2. In Sect. 3 we analyze the
integration problem for mathematical systems from a formal position. Then we
describe how integration can be realized in our framework using partial MMT
theory morphisms in Sect. 4. Finally, Sect. 5 discusses related work and Sect. 6
concludes the paper.

2 The MMT Language 

Agreeing on a common syntax like OpenMath is the first step towards system 
integration. This already enables a number of structural services such as storage 
and transport or editing and browsing that do not depend on the semantics
of the processed expressions. But while we have a good solution for a joint syn- 
tax, it is significantly harder to agree on a joint semantics. Fixing a semantics 
for a system requires a foundational commitment that excludes systems based 
on other foundations. The weakness of the (standard) OpenMath content
dictionaries can be in part explained by this problem: The only agreeable content
dictionaries are those where any axioms (formal or informal) are avoided that
would exclude some foundations.

MMT was designed to overcome this problem by placing it in between frame- 
works like OpenMath and OMDoc on the one hand and logical frameworks like 
LF and CIC on the other hand. The basic idea is that a system's foundation 
itself is represented as a content dictionary. Thus, both meta and object lan- 
guage are represented uniformly as MMT theories. Furthermore, theory mor- 
phisms are employed to translate between theories, which makes MMT expres- 
sive enough to represent translation between meta-languages and thus to sup- 
port cross-foundation integration. As MMT permits the representation of logics 
as theories and internalizes the meta-relation between theories, this provides 
the starting point to analyze the cross-foundation integration challenge within 
a formal framework. 

Syntax We will work with a very simple fragment of the MMT language that
suffices for our purposes, and refer to [RK11] for the full account. It is given by
the following grammar where [−] denotes optional parts and T, v, c, and x are
identifiers:



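$$
\begin{array}{lrcl}
\text{Theory graph}   & \gamma    & ::= & \cdot \mid \gamma,\ T \stackrel{[T]}{=} \{\vartheta\} \mid \gamma,\ v : T \to T \stackrel{[\mu]}{=} \{\sigma\} \\
\text{Theory body}    & \vartheta & ::= & \cdot \mid \vartheta,\ c\,[: O]\,[= O'] \\
\text{Morphism body}  & \sigma    & ::= & \cdot \mid \sigma,\ c \mapsto O \\
\text{Objects}        & O         & ::= & \text{OpenMath objects} \\
\text{Morphisms}      & \mu       & ::= & v \mid id_T \mid \mu \circ \mu \\
\text{Contexts}       & C         & ::= & x_1 : O_1, \ldots, x_n : O_n \\
\text{Substitutions}  & s         & ::= & x_1 := O_1, \ldots, x_n := O_n
\end{array}
$$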


In particular, we omit the module system of MMT that permits imports between 
theories. 

$T \stackrel{L}{=} \{\vartheta\}$ declares a theory T with meta-theory L defined by the list
$\vartheta$ of symbol declarations. The intuition of meta-theories is that L is the
meta-language that declares the foundational symbols used to type and define the
symbol declarations in $\vartheta$.

All symbol declarations in a theory body are of the form c : O = O'. This
declares a new symbol c where both the type O and the definiens O' are optional.
If given, they must be T-objects, which are defined as follows. A symbol is called
accessible to T if it is declared in T or accessible to the meta-theory of T. An
OpenMath object is called a T-object if it only uses symbols that are accessible
to T.

Example 1. Consider the natural numbers defined within the calculus of
constructions (see [BC04]). We represent this in MMT using a theory CIC declaring
untyped, undefined symbols such as Type, $\lambda$, and $\to$. Then Nat is defined as a
theory with meta-theory CIC giving symbol declarations such as N : OMS(cd =
CIC, name = Type) or succ : OMA(OMS(cd = CIC, name = $\to$), OMS(cd = Nat, name =
N), OMS(cd = Nat, name = N)).

S-contexts C are lists of variable declarations $\ldots, x_i : O_i, \ldots$ for S-objects $O_i$.
S-substitutions s for an S-context C are lists of variable assignments
$\ldots, x_i := O_i, \ldots$. In an object O in context C, exactly the variables in C may occur freely;
then for a substitution s for C, we write O[s] for the result of replacing every
free occurrence of $x_i$ with $O_i$.

Relations between MMT theories are expressed using theory morphisms.
Given two theories S and T, a theory morphism from S to T is declared using
$v : S \to T \stackrel{l}{=} \{\sigma\}$. Here $\sigma$ must contain one assignment $c \mapsto O$ for every
symbol c declared in the body of S, and for some T-objects O. If S and T have
meta-theories L and M, then v must also include a meta-morphism $l : L \to M$.

Every $v : S \to T \stackrel{l}{=} \{\sigma\}$ induces a homomorphic extension $v(-)$ that
maps S-objects to T-objects. $v(-)$ is defined by induction on the structure of
OpenMath objects. The base case $v(c)$ for a symbol c is defined as follows: If c
is accessible to the meta-theory of S, we put $v(c) := l(c)$; otherwise, we must
have $c \mapsto O$ in $\sigma$, and we put $v(c) := O$. $v(-)$ also extends to contexts and
substitutions in the obvious way.
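
The inductive definition of the homomorphic extension can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: the classes `OMS`, `OMV`, `OMA`, and `Morphism`, the function `apply`, and the toy theories `FOL`, `Monoid`, and `Nat` are hypothetical names chosen here, not part of MMT or of any OpenMath library, and only symbols, variables, and applications are modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class OMS:
    """Reference to a symbol `name` declared in theory (content dictionary) `cd`."""
    cd: str
    name: str


@dataclass(frozen=True)
class OMV:
    """A variable."""
    name: str


@dataclass(frozen=True)
class OMA:
    """Application of a head object to a tuple of argument objects."""
    head: "Obj"
    args: Tuple["Obj", ...]


Obj = Union[OMS, OMV, OMA]


@dataclass
class Morphism:
    domain: str                         # name of the domain theory S
    sigma: Dict[OMS, Obj]               # assignments c |-> O for the symbols of S
    meta: Optional["Morphism"] = None   # meta-morphism l, if S has a meta-theory


def apply(v: Morphism, obj: Obj) -> Obj:
    """Homomorphic extension v(-), by induction on the object structure."""
    if isinstance(obj, OMS):
        if obj.cd == v.domain:
            return v.sigma[obj]          # base case: c |-> O in sigma
        if v.meta is None:
            raise KeyError(f"{obj} is not accessible to {v.domain}")
        return apply(v.meta, obj)        # base case: v(c) := l(c)
    if isinstance(obj, OMV):
        return obj                       # variables are left unchanged
    return OMA(apply(v, obj.head), tuple(apply(v, a) for a in obj.args))


# Hypothetical toy setup: monoids interpreted in the additive natural numbers,
# with the identity as meta-morphism on a one-symbol logic.
eq = OMS("FOL", "eq")
identity_on_fol = Morphism("FOL", {eq: eq})
comp, unit = OMS("Monoid", "comp"), OMS("Monoid", "unit")
plus, zero = OMS("Nat", "plus"), OMS("Nat", "zero")
v = Morphism("Monoid", {comp: plus, unit: zero}, meta=identity_on_fol)

x = OMV("x")
left_unit = OMA(eq, (OMA(comp, (unit, x)), x))   # comp(unit, x) = x
image = apply(v, left_unit)                      # plus(zero, x) = x
```

Under these assumptions, `image` is the translated left-unit law of the monoid, now stated over the symbols of the target theory.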

By experimental evidence, all declarative languages for mathematics
currently known can be represented faithfully in MMT. In particular, MMT uses
the Curry-Howard representation [CF58,How80] of propositions as types and
proofs as terms. Thus, an axiom named a asserting F is a special case of a
symbol a of type F, and a theorem named t asserting F with proof p is a special
case of a symbol t with type F and definiens p. All inference rules needed to
form p are symbols declared in the meta-theory.



Semantics The use of meta-theories makes the logical foundation of a system 
part of an MMT theory and makes the syntax of MMT foundation-independent. 
The analogue for the semantics is more difficult to achieve: The central idea is 
that the semantics of MMT is parametric in the semantics of the foundation. 

To make this precise, we call a theory without a meta-theory foundational. 
A foundation for MMT consists of a foundational theory L and two judgments 
for typing and equality of objects: 

– $\gamma; C \vdash_T O : O'$ states that O is a T-object over C typed by the T-object O',

– $\gamma; C \vdash_T O = O'$ states the equality of two T-objects over C,

defined for an arbitrary theory T declared in $\gamma$ with meta-theory L. In
particular, MMT does not distinguish terms, types, and values at higher universes:
all expressions are OpenMath objects with an arbitrary binary typing relation 
between them. We will omit C when it is empty. 

These judgments are similar to those used in almost all declarative languages, 
except that we do not commit to a particular inference system — all rules are 
provided by the foundation and are transparent to MMT except for the rules for 
the base cases of T-objects: 

$$
\frac{T \stackrel{L}{=} \{\vartheta\} \text{ in } \gamma \qquad c : O = O' \text{ in } \vartheta}{\gamma \vdash_T c : O}\; T_{:}
\qquad\qquad
\frac{T \stackrel{L}{=} \{\vartheta\} \text{ in } \gamma \qquad c : O = O' \text{ in } \vartheta}{\gamma \vdash_T c = O'}\; T_{=}
$$

and accordingly if O or O' are omitted. For example, adding the usual rules for
the calculus of constructions yields a foundation for the foundational theory CIC.
Given a foundation, MMT defines (among others) the judgments

– $\gamma \vdash \mu : S \to T$ states that $\mu$ is a theory morphism from S to T,

– if $\gamma \vdash \mu_i : S \to T$, then $\gamma \vdash \mu_1 = \mu_2$ states that $\vdash_T \mu_1(c) = \mu_2(c)$ for all
symbols c that are accessible to S,

– $\gamma \vdash_S s : C$ states that s is a well-typed substitution for C, i.e., for every
$x_i := O_i$ in s and $x_i : O_i'$ in C, we have $\gamma \vdash_S O_i : O_i'$,

– $\vdash \gamma$ states that $\gamma$ is a well-formed theory graph.

In the sequel, we will omit $\gamma$ if it is clear from the context.

The most important MMT rule for our purposes is the rule that permits
adding an assignment to a theory morphism: If S contains a declaration
$c : O_1 = O_2$, then a theory morphism $v : S \to T = \{\sigma\}$ may contain an assignment
$c \mapsto O$ only if $\vdash_T O : v(O_1)$ and $\vdash_T O = v(O_2)$. The according rule applies if c
has no type or no definiens. Of course, this means that assignments $c \mapsto O$ are
redundant if c has a definiens; but it is helpful to state the rule in this way to
prepare for our definitions below.

Due to these rules, we obtain that if $\gamma \vdash \mu : S \to T$ and $\vdash_S O : O'$ or
$\vdash_S O = O'$, then $\vdash_T \mu(O) : \mu(O')$ and $\vdash_T \mu(O) = \mu(O')$, respectively. Thus,
typing and equality are preserved along theory morphisms.

Due to the Curry-Howard representation, this includes the preservation of
provability: $\vdash_T p : F$ states that p is a well-formed proof of F in T. And if S
contains an axiom a : F, a morphism $\mu$ from S to T must map a to a T-object
of type $\mu(F)$, i.e., to a T-proof of $\mu(F)$. This yields the well-known intuition
of a theory morphism. In particular, if $\mu$ is the identity on those symbols that
do not represent axioms, then $\vdash \mu : S \to T$ implies that every S-theorem is a
T-theorem.

MMT is parametric in the particular choice of type system — any type 
system can be used by giving the respective meta-theory. The type systems may 
themselves be defined in a further meta-theory. For example, many of our actual
encodings are done with the logical framework LF [HHP93] as the ultimate
meta-theory. The flexibility to use MMT with or without a logical framework
that takes care of all typing aspects is a particular strength of MMT. 

3 Integration Challenges 

In this section, we will develop some general intuitions about system integration 
and then give precise definitions in MMT. A particular strength of MMT is 
that we can give these precise definitions without committing to a particular 
foundational system and thus without loss of generality. 

The typical integration situation is that we have two systems $S_i$ for i = 1, 2
that implement a shared specification Spec. For example, these systems can be
computer algebra systems or (semi-)automated theorem provers. Our integration
goal is to move problems and results between $S_1$ and $S_2$.

Specifications and Systems Let us first assume a single system S implementing
Spec, whose properties are given by logical consequence relations $\Vdash_{Spec}$ and $\Vdash_S$.
We call S sound if $\Vdash_S F$ implies $\Vdash_{Spec} F$ for every formula F in the language
of Spec. Conversely, we call S complete if $\Vdash_{Spec} F$ implies $\Vdash_S F$.

While these requirements seem quite natural at first, they are too strict for 
practical purposes. It is well-known that soundness fails for many CASs, which 
compute wrong results by not checking side conditions during simplification. 
Reasons for incompleteness can be theoretical (e.g., when S is a first-order
prover and Spec a higher-order specification) or practical (e.g., due to
resource limitations).

Moreover, soundness also fails in the case of underspecification: S is usually 
much stronger than Spec because it must commit to concrete definitions and im- 
plementations for operations that are loosely specified in Spec. A typical example 
is the representation of undefined terms (see [Far04] for a survey of techniques).
If Spec specifies the rational numbers using in particular $\forall x.\ x \neq 0 \Rightarrow x/x = 1$,
and S defines 1/0 = 2/0 = 0, then S is not sound because 1/0 = 2/0 is not a
theorem of Spec.

We can define the above notions in MMT as follows. A specification Spec is 
an MMT theory; its meta-theory (if any) is called the specification language. A 
system implementing Spec consists of an MMT theory S and an MMT theory 
morphism $v : Spec \to S$; the meta-theory of S (if any) is called the implementation
language. With this definition and using the Curry-Howard representation
of MMT, we can provide a deductive system for the consequence relations used
above: $\Vdash_{Spec} F$ iff there is a p such that $\vdash_{Spec} p : F$; and accordingly for $\Vdash_S$.

In the simplest case, the morphism v is an inclusion, i.e., for every symbol 
in Spec, S contains a symbol of the same name. Using an arbitrary morphism 
v provides more flexibility; for example, the theory of the natural numbers with
addition and multiplication implements the specification of monoids in two dif- 
ferent ways via two different morphisms. 

Example 2. We use a theory for second-order logic as the specification language;
it declares symbols for $\forall$, $=$, etc. Spec = Nat is a theory for the natural numbers;
it declares symbols N, 0, and succ as well as one symbol a : F for each Peano
axiom F.

For the implementation language, we use a theory ZF for ZF set theory; it
has meta-theory first-order logic and declares symbols for set, $\in$, $\emptyset$, etc. Then
we can implement the natural numbers in a theory $S = Nat_{ZF}$ (the subscript
distinguishes it from the specification Nat) declaring, e.g., a
symbol 0 defined as $\emptyset$, a symbol succ defined such that $succ(n) = n \cup \{n\}$,
and prove one theorem a : F = p in S for each Peano axiom. Note that $Nat_{ZF}$
yields theorems about the natural numbers that cannot be expressed in Spec,
for example $\Vdash_{ZF} 0 \in 1$. We obtain a morphism $\mu_1 : Nat \to Nat_{ZF}$ using $N \mapsto N$,
$0 \mapsto 0$, etc.

Continuing Ex. 1, we obtain a different implementation $\mu_2 : Nat \to Nat_{CIC}$, where
$Nat_{CIC}$ denotes the CIC-based theory of Ex. 1, using $N \mapsto N$, $0 \mapsto 0$, etc.

To capture practice in formal mathematics, we have to distinguish between 
the definitional and the axiomatic method. The axiomatic method fixes a formal
system L and then describes mathematical notions in L-theories T using free
symbols and axioms. T is interpreted in models, which may or may not exist.
This is common in model theoretical logics, especially first-order logic, and in
algebraic specification. In MMT, T is represented as a theory with meta-theory
L and with only undefined constants. In Ex. 2, L is second-order logic and T is
Spec.

The definitional method, on the other hand, fixes a formal system L together
with a minimal theory $T_0$ and then describes mathematical notions using
definitional extensions T of $T_0$. The properties of the notions defined in $T_0$ are derived
as theorems. The interpretation of T is uniquely determined given a model of $T_0$.
This is common in proof theoretical logics, especially LCF-style proof assistants,
and in set theory. In Ex. 2, L is first-order logic, $T_0$ is ZF, and T is S.

Types of Integration Let us now consider a specification Spec and two
implementations $\mu_i : Spec \to S_i$. To simplify the notation, we will write $\vdash$ and $\vdash_i$ instead
of $\vdash_{Spec}$ and $\vdash_{S_i}$, and likewise for $\Vdash$. We first describe different ways how to
integrate $S_1$ and $S_2$ intuitively.

Borrowing means to use $S_1$ to prove theorems in the language of $S_2$. Thus,
the input to $S_1$ is a conjecture F and the output is an expression $\vdash_1 p : F$. In
general, since MMT does not prescribe a calculus for proofs, the object p can be
a formal proof term, a certificate, proof sketch, or simply a yes/no answer.

Computation means to reuse an $S_1$ computation in $S_2$. Thus, the input of
$S_1$ is an expression t, and the output is a proof p with an expression t' such
that $\vdash_1 p : t = t'$. To be useful, t' should be simpler than t in some way, e.g.,
maximally simplified or even normalized.

Querying means answering a query in $S_1$ and transferring the results to $S_2$.
This is similar to borrowing in that the input to $S_1$ is a formula F. However,
now F may contain free variables, and the output is not only a proof p but also
a substitution s for the free variables such that $\vdash_1 p : F[s]$.

In all cases, a translation I must be employed to translate the input from $S_2$
to $S_1$. Similarly, we need a translation O in the opposite direction to translate
the output t' and s and (if available) p from $S_1$ to $S_2$.

To define these integration types formally in MMT, we first note that borrow- 
ing is a special case of querying if F has no free variables. Similarly, computation 
is a special case of querying if F has the form t = X for a variable X that does
not occur in t. 

To define querying in MMT, we assume a specification,
two implementations, and morphisms I and O as on the
right. I and O must satisfy $O \circ I = id_{S_2}$, $O \circ \mu_1 = \mu_2$, and
$I \circ \mu_2 = \mu_1$. Then we obtain the following general form of
an integration problem: Given an $S_2$-context C and a query
$C \vdash_2 ? : F$ (where ? denotes the requested proof), find a substitution $\vdash_1 s : I(C)$
and a proof $\vdash_1 p : I(F)[s]$. Then MMT guarantees that $\vdash_2 O(p) : F[O(s)]$ so that
we obtain O(s) as the solution. Moreover, only the existence of O is necessary
but not O itself: once a proof p is found in $S_1$, the existence of O ensures that
F is true in $S_2$, and it is not necessary to translate p to $S_2$.

We call the above scenario safe bidirectional communication between $S_1$ and
$S_2$ because I and O are theory morphisms and thus guarantee that consequence
and truth are preserved in both directions. This scenario is often implicitly
assumed by people coming from the first-order logic community. Indeed, if $S_1$ and
$S_2$ are automatic or interactive theorem provers for first-order logic, then the
logic of the two systems is the same and both $S_1$ and $S_2$ are equal to Spec.
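
The guarantee that a solution found in $S_1$ yields a solution in $S_2$ can be unfolded step by step. The derivation below is a sketch rather than a quotation of a formal proof; it assumes, beyond the conditions on I and O, that morphism application commutes with substitution, as one would expect of a homomorphic extension.

```latex
\begin{align*}
                 & \vdash_1 p : I(F)[s]
                 && \text{(answer found in } S_1\text{)} \\
\Longrightarrow\ & \vdash_2 O(p) : O\bigl(I(F)[s]\bigr)
                 && \text{(} O \text{ is a theory morphism and preserves typing)} \\
\Longrightarrow\ & \vdash_2 O(p) : O(I(F))[O(s)]
                 && \text{(assumption: } O \text{ commutes with substitution)} \\
\Longrightarrow\ & \vdash_2 O(p) : F[O(s)]
                 && \text{(} O \circ I = id_{S_2}\text{)}
\end{align*}
```

The last step is the only place where the condition $O \circ I = id_{S_2}$ is used, which is why the mere existence of O suffices to conclude that the query has a solution in $S_2$.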

If we are only interested in safe directed communication, i.e., transferring
results from $S_1$ to $S_2$, then it is sufficient to require only O. Indeed, often $\mu_2$
is an inclusion, and the input parameters C and F, which are technically
$S_2$-objects, only use symbols from Spec. Thus, they can be moved directly to Spec
and $S_1$, and I is not needed.

Similarly, the substitution s can often be stated in terms of Spec. In that case, 
O is only needed to translate the proof p. If the proof translation is not feasible, O 
may be omitted as well. Then we speak of unsafe communication because we do 
not have a guarantee that the communication of results is correct. For example, 
let $S_1$ and $S_2$ be two CASs that may compute wrong results by not checking
side conditions during simplification. Giving a theory morphism O means that
the "bugs" of the system $S_1$ must be "compatible" with the "bugs" of $S_2$, which
is quite unlikely. 

The above framework for safe communication via theory morphisms is
particularly appropriate for the integration of axiomatic systems. However, if $S_1$ and
$S_2$ employ different mathematical foundations or different variants of the same
foundation, it can be difficult to establish the necessary theory morphisms. In
MMT, this means that $S_1$ and $S_2$ have different meta-theories so that I and O
must include a meta-morphism. Therefore, unsafe communication is often used 
in practice, and even that can be difficult to implement. 

Our framework is less appropriate if $S_1$ or $S_2$ are developed using the
definitional method. For example, consider Aczel's encoding of set theory in type
theory [Acz98,Wer97]. Here $S_1$ is the ZF-based theory of the natural numbers from
Ex. 2, written $Nat_{ZF}$, and $S_2$ is the CIC-based theory from Ex. 1, written $Nat_{CIC}$.
Aczel's encoding provides the needed meta-morphism $l : ZF \to CIC$ of O. But
because $Nat_{ZF}$ is definitional, we already have O = l, and we have no freedom to
define O such that it maps the concepts of $Nat_{ZF}$ to their counterparts in $Nat_{CIC}$.
Formally, in MMT, this means that the condition $O \circ \mu_1 = \mu_2$ fails. Instead, we
obtain two versions of the natural numbers in CIC: a native one given by $\mu_2$
and the translation of $Nat_{ZF}$ given by $O \circ \mu_1$. Indeed, the latter must satisfy all
ZF-theorems including, e.g., $0 \in 1$, which is not even a well-formed formula over
$Nat_{CIC}$. We speak of faithful communication if $O \circ \mu_1 = \mu_2$ can be established even
when $S_1$ is definitional. This is not possible in MMT without the extension we
propose below.

4 A Framework for System Integration 

In order to realize faithful communication within MMT, we introduce partial 
theory morphisms that can filter out those definitional details of $S_1$ that need
not and cannot be mapped to $S_2$. We will develop this new concept in general
in Sect. 4.1 and then apply it to the integration problem in Sect. 4.2.

4.1 Partial Theory Morphisms in MMT 

Syntax We extend the MMT syntax with the production $O ::= \top$. The intended
use of $\top$ is to put assignments $c \mapsto \top$ into the body of a morphism
$v : S \to T = \{\sigma\}$ in order to make v undefined at c. We say that v filters c. The
homomorphic extension $v(-)$ remains unchanged and is still total: If O contains
filtered symbols, then v(O) contains $\top$ as a subobject. In that case, we say v
filters O.
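
A minimal sketch may help to see why the homomorphic extension stays total under filtering. The representation is an assumption made for illustration only: objects are symbol names (strings) or tuples of objects, all symbols live in one flat namespace, the string `"TOP"` stands for $\top$, and the names `apply`, `filters`, and `eta` are hypothetical.

```python
TOP = "TOP"  # stands for the filtered object; assumed not to clash with a symbol name


def apply(sigma, obj):
    """Homomorphic extension: obj is a symbol name or a tuple (head, arg1, ...)."""
    if isinstance(obj, str):
        return sigma[obj]                      # base case: look up the assignment
    return tuple(apply(sigma, part) for part in obj)


def contains_top(obj):
    """True if TOP occurs anywhere inside obj."""
    if isinstance(obj, str):
        return obj == TOP
    return any(contains_top(part) for part in obj)


def filters(sigma, obj):
    """v filters O iff v(O) contains TOP as a subobject."""
    return contains_top(apply(sigma, obj))


# Hypothetical partial morphism: set-theoretic symbols are filtered,
# arithmetic symbols are mapped to symbols of the same name.
eta = {
    "N": "N",
    "zero": "zero",
    "succ": "succ",
    "empty": TOP,   # c |-> TOP makes the morphism undefined at c
    "in": TOP,
}

arithmetic = ("succ", "zero")
set_theoretic = ("in", "zero", "empty")
```

Here `apply(eta, set_theoretic)` still returns an object, but one that contains `TOP`, whereas `arithmetic` passes through unfiltered.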

Semantics We refine the semantics as follows. A dependency cut D for an MMT
theory T is a pair $(D_{type}, D_{def})$ of two sets of symbols accessible to T. Given
such a dependency cut, we define dependency-aware judgments $\gamma \vdash_D O : O'$
and $\gamma \vdash_D O = O'$ as follows. $\gamma \vdash_D O : O'$ means that there is a derivation of
$\gamma \vdash_T O : O'$ that uses the rules $T_{:}$ and $T_{=}$ at most for the constants in $D_{type}$ and
$D_{def}$, respectively. $\gamma \vdash_D O = O'$ is defined accordingly.

In other words, if we have $\gamma \vdash_D O : O'$ and obtain $\gamma'$ by changing the type
of any constant not in $D_{type}$ or the definiens of any constant not in $D_{def}$, then
we still have $\gamma' \vdash_D O : O'$. Then a foundation consists of a foundational theory
L together with dependency-aware judgments for typing and equality whenever
T has meta-theory L.

We make a crucial change to the MMT rule for assignments in a theory
morphism: If S contains a declaration $c : O_1 = O_2$, then a theory morphism
$v : S \to T = \{\sigma\}$ may contain the assignment $c \mapsto O$ only if the following two
conditions hold: (i) if $O_1$ is not filtered by v, then $\vdash_T O : v(O_1)$; (ii) if $O_2$ is not
filtered by v, then $\vdash_T O = v(O_2)$. The according rule applies if $O_1$ or $O_2$ are
omitted.

In [RK11], a stricter condition is used. There, if $O_1$ or $O_2$ are filtered, then c
must be filtered as well. While this is a natural strictness condition for filtering,
it is inappropriate for our use cases: For example, filtering all L-symbols would
entail filtering all S-symbols.

Our weakened strictness condition is still strong enough to prove the central
property of theory morphisms: If $\gamma \vdash v : S \to T$ and $\vdash_D O : O'$ for some
$D = (D_{type}, D_{def})$ and v does not filter O, O', the type of a constant in $D_{type}$,
or the definiens of a constant in $D_{def}$, then $\vdash_T v(O) : v(O')$. The according
result holds for the equality judgment.

Finally, we define the weak equality of morphisms $\mu_i : S \to T$. We define
$\vdash \mu_1 \leq \mu_2$ in the same way as $\vdash \mu_1 = \mu_2$ except that $\vdash_T \mu_1(c) = \mu_2(c)$ is only
required if c is not filtered by $\mu_1$. We say that $\vdash \eta : T \to S$ is a partial inverse
of $\mu : S \to T$ if $\vdash \eta \circ \mu = id_S$ and $\vdash \mu \circ \eta \leq id_T$.
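
The two conditions on a partial inverse can be checked mechanically in a toy setting. The sketch below makes strong simplifying assumptions: morphisms are dictionaries from symbol names to objects, the equality judgment is approximated by syntactic equality, meta-theories are flattened into the theories, and the string `"TOP"` stands for $\top$. All names (`compose`, `weakly_below`, `is_partial_inverse`, the `_zf` symbols) are hypothetical.

```python
TOP = "TOP"  # stands for the filtered object


def apply(sigma, obj):
    """Homomorphic extension on symbol names and tuples; TOP is mapped to TOP."""
    if obj == TOP:
        return TOP
    if isinstance(obj, str):
        return sigma[obj]
    return tuple(apply(sigma, part) for part in obj)


def contains_top(obj):
    if isinstance(obj, str):
        return obj == TOP
    return any(contains_top(part) for part in obj)


def compose(second, first):
    """Assignments of the composition: apply `first`, then `second`."""
    return {c: apply(second, image) for c, image in first.items()}


def equal(m1, m2):
    """Equality of morphisms with the same domain (syntactic stand-in)."""
    return all(m1[c] == m2[c] for c in m1)


def weakly_below(m1, m2):
    """Weak equality: agreement is only required where m1 does not filter c."""
    return all(m1[c] == m2[c] for c in m1 if not contains_top(m1[c]))


def is_partial_inverse(eta, mu):
    identity_s = {c: c for c in mu}    # identity on the domain of mu
    identity_t = {c: c for c in eta}   # identity on the domain of eta
    return (equal(compose(eta, mu), identity_s)
            and weakly_below(compose(mu, eta), identity_t))


# Hypothetical example: a specification of the naturals and a ZF-style implementation.
mu = {"N": "N_zf", "zero": "zero_zf", "succ": "succ_zf"}
eta = {"N_zf": "N", "zero_zf": "zero", "succ_zf": "succ",
       "empty": TOP, "union": TOP}
eta_wrong = {"N_zf": "N", "zero_zf": "N", "succ_zf": "succ",
             "empty": TOP, "union": TOP}
```

With these definitions, `is_partial_inverse(eta, mu)` holds, while `eta_wrong` fails the first condition because it does not map `zero_zf` back to `zero`.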

Example 3. Consider the morphism $\mu_1 : Nat \to Nat_{ZF}$ from Ex. 2, where $Nat_{ZF}$
denotes the ZF-based theory S of that example. We build its
partial inverse $\eta : Nat_{ZF} \to Nat = \{\sigma\}$. The meta-morphism l filters all symbols
of ZF, e.g., $l(\emptyset) = \top$. Then the symbol N of $Nat_{ZF}$ has filtered type and filtered
definiens. Therefore, the conditions (i) and (ii) above are vacuous, and we use
$N \mapsto N$ in $\sigma$. Then all remaining symbols of $Nat_{ZF}$ (including the theorems) have
filtered definiens but unfiltered types. For example, for $0 : N = \emptyset$ we have
$\eta(\emptyset) = \top$ but $\eta(N) = N$. Therefore, condition (ii) is vacuous, and we map these
symbols to their counterparts in Nat, e.g., using $0 \mapsto 0$ in $\sigma$. These assignments
are type-preserving as required by condition (i) above, e.g., $\vdash_{Nat} 0 : \eta(N)$.



4.2 Integration via Partial Theory Morphisms 

The following gives a typical application of our framework by safely and faithfully 
communicating proofs from a stronger to a weaker system: 

Example 4. In [IR11], we gave formalizations of Zermelo-Fraenkel (ZFC) set
theory and Mizar's Tarski-Grothendieck set theory (TG) using the logical
framework LF as the common meta-theory. ZFC and TG share the language of
first-order set theory. But TG is stronger than ZFC because of Tarski's axiom,
which implies, e.g., the sentence I stating the existence of infinite sets (which
is an axiom in ZFC) and large cardinals (which is unprovable in ZFC). For
example, we have an axiom $a_\infty : I$ in ZFC, and an axiom tarski : T and a
theorem $t_\infty : I = P$ in TG. Many TG-theorems do not actually depend on this
additional strength, but they do depend on $t_\infty$ and thus indirectly on tarski.

Using our framework, we can capture such a theorem as the case of a
TG-theorem $\vdash_D p : F$ where F is the theorem statement and $t_\infty \in D_{type}$ but
$t_\infty \notin D_{def}$ and $tarski \notin D_{type}$. We can give a partial theory morphism
$v : TG \to ZFC = \{\ldots, t_\infty \mapsto a_\infty, \ldots\}$. Then v does not filter p, and we obtain
$\vdash_{ZFC} v(p) : F$.

Assume now that we have two implementations $\mu_i : Spec \to S_i$
of Spec and partial inverses $\eta_i$ of $\mu_i$, where $S_i$ has
meta-theory $L_i$. This leads to the diagram on the right where
(dashed) edges are (partial) theory morphisms. We can now
obtain the translations $I : S_2 \to S_1$ and $O : S_1 \to S_2$ as $I = \mu_1 \circ \eta_2$
and $O = \mu_2 \circ \eta_1$. Note that I and O are partial inverses of each
other.

As in Sect. 3, let $C \vdash_2 ? : F$ be a query in $S_2$. If $\eta_2$ does not filter any symbols
in C or F, we obtain the translated problem $I(C) \vdash_1 ? : I(F)$. Let us further
assume that there is an $S_1$-substitution $\vdash_1 s : I(C)$ and a proof $\vdash_1 p : I(F)[s]$
such that p and s are not filtered by $\eta_1$. Because I and O are mutually inverse and
morphism application preserves typing, we obtain the solution $\vdash_2 O(p) : F[O(s)]$.

The condition that $\eta_2$ does not filter C and F is quite reasonable in practice:
Otherwise, the meaning of the query would depend on implementation-specific
details of $S_2$, and it is unlikely that $S_1$ should be able to find an answer anyway.
On the other hand, the morphism $\eta_1$ is more likely to filter the proof p. Moreover,
since the proof must be translated from $L_1$ to $L_2$ passing through Spec, the latter
must include a proof system to allow translation of proofs. In practice this is
rarely the case, even if the consequence relation of Spec can be expressed as
an inference system. For example, large parts of mathematics or the OpenMath
content dictionaries implicitly (import) first-order logic and ZF set theory.

We outline two ways how to remedy this: We can communicate filtered proofs
or change the morphisms to widen the filters to let more proofs pass. 

Communicating Filtered Proofs Firstly, if the proof rules of $S_1$ are filtered by
$\eta_1$, what is received by $S_2$ after applying the output translation O is a filtered
proof, i.e., a proof object that contains the constant $\top$. $\top$ represents gaps in the
proof that were lost in the translation.

In an extreme case, all applications of proof rules become $\top$, and the only
unfiltered parts of O(p) are formulas that occurred as intermediate results during
the proof. In that case, O(p) is essentially a list of formulas $F_i$ (a proof sketch
in the sense of [Wie03]) such that $I(F_1) \wedge \ldots \wedge I(F_{i-1}) \Vdash_1 I(F_i)$ for $i = 1, \ldots, n$.
In order to refine O(p) into a proof, we have to derive $\Vdash_2 F_n$. Most of the time,
it will be the case that $F_1, \ldots, F_{i-1} \Vdash_2 F_i$ for all i, and the proof is obtained
compositionally if $S_2$ can fill the gaps through automated reasoning. When this
happens, the proof sketch is already a complete declarative proof.

Example 5. Let iSi and 52 be implementations of the rational numbers with 
different choices for division by zero. In Si, division by zero yields a special 
value for undefined results, and operations on undefined values yield undefined 
results; then we have the iSi-theorem t asserting Va,b,c.a{b/c) = {ab)/c. In ^2, 
we have n/0 = 1 and n%0 = n; then we have the iS2-theorems ^1,^2, ^3 asserting 
Vm, n.n = (n/m) * m + n%m, \fm.m/m = 1, and \/m.m%m = 0. 

The choice in ^2 reduces the number of case analyses in basic proofs. But t 
is not a theorem of S2] instead, we only have a theorem t' asserting Va, &, c.c ^ 
^ a(h/c) = {ab)/c. On the other hand. Si is closer to common mathematics, 
but the ti are not theorems of Si because the side condition m 7^ is needed. 

Hence, we do not have a total theory morphism $O : S_1 \to S_2$, but we can give
a partial theory morphism $O$ that filters $t$. Now consider, for example, a proof $p$
over $S_1$ that instantiates $t$ with some values $A$, $B$, $C$. When translating $p$ to $S_2$,
$t$ is filtered, but we can still communicate $p$, and $S_2$ can treat $O(p)$ as a proof
sketch. Typically, $t$ is applied in a context where $C \neq 0$ is known anyway, so
that $S_2$ can patch $O(p)$ by using $t'$, which can easily be found by automated
reasoning.
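
The two conventions of Example 5 can be tried out on a finite set of sample
values. The sketch below is only a sanity check on samples, not a proof. It
assumes that the undefined value of $S_1$ is modelled by `None` and is equal to
itself, and that the remainder in $S_2$ over the rationals is 0 whenever the
divisor is nonzero; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

UNDEF = None  # assumption: S1's special value for undefined results

def mul1(x, y):
    """Multiplication in S1: undefined arguments give an undefined result."""
    return UNDEF if x is UNDEF or y is UNDEF else x * y

def div1(x, y):
    """Division in S1: division by zero is undefined."""
    if x is UNDEF or y is UNDEF or y == 0:
        return UNDEF
    return x / y

def div2(n, m):
    """Division in S2: n/0 = 1."""
    return Fraction(1) if m == 0 else n / m

def mod2(n, m):
    """Remainder in S2: n%0 = n; assumed to be 0 for nonzero rational m."""
    return n if m == 0 else Fraction(0)

SAMPLES = [Fraction(k, 2) for k in range(-4, 5)]  # includes 0

def t_in_s1():
    return all(mul1(a, div1(b, c)) == div1(mul1(a, b), c)
               for a, b, c in product(SAMPLES, repeat=3))

def t_in_s2():
    return all(a * div2(b, c) == div2(a * b, c)
               for a, b, c in product(SAMPLES, repeat=3))

def t_prime_in_s2():
    return all(c == 0 or a * div2(b, c) == div2(a * b, c)
               for a, b, c in product(SAMPLES, repeat=3))

def t1_in_s2():
    return all(n == div2(n, m) * m + mod2(n, m)
               for m, n in product(SAMPLES, repeat=2))

def t2_in_s2():
    return all(div2(m, m) == 1 for m in SAMPLES)

def t3_in_s2():
    return all(mod2(m, m) == 0 for m in SAMPLES)

def t2_in_s1():
    return all(div1(m, m) == 1 for m in SAMPLES)

print(t_in_s1(), t_in_s2(), t_prime_in_s2())
print(t1_in_s2(), t2_in_s2(), t3_in_s2(), t2_in_s1())
```

On these samples, $t$ holds under the $S_1$ convention and fails under the
$S_2$ convention only for $c = 0$, which is why the guarded variant $t'$
suffices to patch a translated proof whenever $C \neq 0$ is known.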

Integration in the other direction works accordingly. 

Widening the Filters An alternative solution is to use
additional knowledge about $S_1$ and $S_2$ to obtain a
translation where $O(p)$ is not filtered. In particular,
if $p$ is filtered completely, we can strengthen Spec by
adding an inference system for the consequence relation
of Spec, thus obtaining Spec'. Then we can extend
the morphisms $\mu_i$ accordingly to $\mu'_i$, which amounts to
proving that $S_i$ is a correct implementation of Spec.
Now $\eta_i$ can be extended as well so that its domain becomes bigger, i.e., the
morphism $\eta_1$ and thus $O$ filter fewer proofs and become "wider".

Note that we are flexible in defining Spec' as required by the particular choices
of $L_1$ and $L_2$. That way the official specification remains unchanged, and we can
maximize the filters for every individual integration scenario.

Example 6 (continuing an earlier example). A typical situation is that we have a theorem
$F$ over Nat whose proof $p$ uses the Peano axioms and the rules of first-order
logic but does not expand the definitions of the natural numbers. Moreover, if
$a : A = P$ is a theorem in Nat that establishes one of the Peano axioms, then $p$
will refer to $a$, but will not expand the definition of $a$. Formally, we can describe
this as $\vdash_D p : F$ where $0, a \in D_{type}$ but $0, a \notin D_{def}$.

We can form Spec' by extending Spec with proof rules for first-order logic
and extend $\eta$ to $\eta'$ accordingly. Since $\eta$ does not filter the types of $0$ and $a$, we
obtain a proof $\vdash_{Spec'} \eta'(p) : \eta'(F)$ due to the type-preservation properties of our
partial theory morphisms. Despite the partiality of $\eta'$, the correctness of this
proof is guaranteed by the framework.

Neither way of integrating systems is new; both have been used ad hoc in
concrete integration approaches, see Sect. 5. With our framework, we are able
to capture them rigorously, so that their soundness can be studied
formally.

5 Related Work 

The MoWGLI project [MoW04] introduced the concept of "semantic markup"
for specifications in the calculus of constructions as distinct from the "content
markup" in OpenMath and OMDoc. This corresponds closely to the use of
meta-theories in MMT: "content markup" corresponds to MMT theories without
meta-theory, and "semantic markup" corresponds to MMT theories with
meta-theory CIC.

A framework very similar to ours was given in [CFW03]. Our MMT theories
with meta-theory correspond to their biform theories, except that the latter
add algorithms. Our theory morphisms $I$ and $O$ correspond to their translations
export and import. The key improvement of our framework over [CFW03]
is that, using MMT's meta-theories, the involved logics and their consequence
relations can be defined declaratively themselves so that a logic-independent
implementation becomes possible. Similarly, using logic morphisms, it becomes
possible to implement and verify the trustability conditions concisely.

Integration by borrowing is the typical scenario of integrating theorem provers
and proof assistants. For example, Leo-II [BPTF08] or the Sledgehammer tactic
of Isabelle [MP08] ($S_2$) use first-order provers ($S_1$) to reason in higher-order logic.
Here the input translation $I$ is a partial inverse of the inclusion from first-order
logic to higher-order logic. A total translation from modal logic to first-order
logic is used in [HS00]. In all cases, the safety is verified informally on the
meta-level and no output translation $O$ in our sense is used. But Isabelle makes the
communication safe by reconstructing a proof from the proof (sketch) returned
by the prover.

The above systems are called on demand using an input translation $I$.
Alternatively, a collection of $S_1$-proofs can be translated via an output translation $O$
for later reuse in $S_2$; in that case no input translation $I$ is used at all. Examples
are the translations from Isabelle/HOL to HOL Light [McL06], from HOL Light
to Isabelle/HOL [OS06], from HOL Light to Coq [KW10], or from Isabelle/HOL
to Isabelle/ZF [KS10]. The translation from HOL to Isabelle/HOL is notable
because it permits faithful translations, e.g., the real numbers of HOL can be
translated to the real numbers of Isabelle/HOL, even though the two systems
define them differently. The safety of the translation is achieved by recording
individual $S_1$-proofs and replaying them in $S_2$. This was difficult to achieve even
though $S_1$ and $S_2$ are based on the same logic.

The translation given in [KW10] is the first faithful translation from HOL
proofs to CIC proofs. Since the two logics are different, in order to obtain a total
map the authors widen the filter by assuming additional axioms on CIC
(excluded middle and extensionality of functions). This technique is not applicable
when the required axioms are inconsistent. Moreover, the translation is
suboptimal, since it uses excluded middle also for proofs that are intuitionistic. To
improve the solution, we could use partial theory morphisms that map case
analysis over booleans in HOL to $\top$, and then use automation to avoid excluded
middle in CIC when the properties involved are all decidable.

In all of the above examples except [KW10], the translations used are not verified
within a logical framework. The Logosphere project [PSK+03] used the
proof-theoretical framework LF to provide statically verified logic translations that
permit inherently safe communication. Here the dynamic verification of
translated proofs becomes redundant. The most advanced such proof translation is
one from HOL to Nuprl [NSM01].

The theory of institutions [GB92] provides a general model-theoretical
framework in which borrowing has been studied extensively [CM97] and implemented
successfully [MML07]. Here the focus is on giving the morphism $I$ explicitly and
using a model-theoretical argument to establish the existence of some $O$; then
communication is safe without explicitly translating proofs.

Integration by computation is the typical scenario for the integration of
computer algebra systems, which is the main topic of the Calculemus series of
conferences. For typical examples, see [DM05], where the computation is performed by
a CAS, and [AT07], where the computation is done by a term rewriting system.
Communication is typically unsafe. Alternatively, safety can be achieved if the
results of the CAS (e.g., the factorization of a polynomial) can be verified
formally in a DS, as done in [HT98] and [Sor00].

A typical application of integration by querying is conjunctive query
answering for a description logic. For example, in [TSP08], a first-order theorem prover
is used to answer queries about the SUMO ontology.

The communication of filtered proofs essentially leads to formal proof sketches
in the sense of [Wie03]. The idea of abstracting from a proof to a proof sketch
corresponds to the assertion-level proofs used in [Mei00] to integrate first-order
provers. The recording and replaying of proof steps in [OS06] and the
reconstruction of proofs in Isabelle are also special cases of the communication of filtered
proofs.



6 Conclusion 

In this paper we addressed the problem of preserving the semantics in
protocol-based integration of mathematical reasoning and computation systems. We
analyzed the problem from a foundational point of view and proposed a framework
based on theory graphs, partial theory morphisms, and explicit representations
of meta-logics that allows us to state solutions to the integration problem.

The main contribution and novelty of the paper is that it paves the way
towards a theory of integration. Theoretically, via filtering, this theory may be
able to combine faithfulness with static verification, which would be a major step
towards the integration and merging of system libraries. Moreover, we believe it
is practical because it requires only a simple extension of the MMT framework,
which already takes scalability issues very seriously [KRZ10].

We do not expect that our specific solution covers all integration problems 
that come up in practice. But we do expect that it will take a long time to 
exhaust the potential that our framework offers.